feat(graph): add suggest-base subcommand with collection impact analysis#1103
feat(graph): add suggest-base subcommand with collection impact analysis#1103dhellmann wants to merge 3 commits into
Conversation
📝 WalkthroughWalkthroughAdds a new CLI subcommand to find packages shared across multiple collection graphs. Implements helpers to load canonical package names (excluding ROOT), compute candidate packages appearing in at least a threshold of collections, calculate per-collection impact when proposed base packages are removed, and render results as a rich table or JSON. Adds Click wiring with --base, --min-collections, and --format options and comprehensive tests for helpers, output formats, base handling, and validation/error cases. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Here's some sample output for some downstream collections. Using the command: Output (click to expand)produced this output: |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fromager/commands/graph.py`:
- Around line 1050-1082: base-only (orphan) packages from the supplied base
graph are not computed or emitted; compute orphan_packages = base_packages -
union(*collections) after base_packages is populated, then pass this
orphan_packages into the output helpers (_suggest_base_json and
_suggest_base_table) so the JSON/table include base packages that no analyzed
collection references; update the calls to _suggest_base_json and
_suggest_base_table to accept the new orphan_packages argument and update those
functions' signatures to render the orphan set (and keep existing fields like
in_base annotation intact).
- Around line 1041-1048: The collections dict is keyed by a non-unique short
name (from _get_collection_name(path)), causing collisions; instead key the dict
by a unique identifier (e.g., the collection file path or its resolved string)
while still computing and storing the short/display name from
_get_collection_name(path) for output. In the loop that iterates
collection_graphs, use a unique key like str(path.resolve()) or path.as_posix()
for collections[key] = pkgs and keep name = _get_collection_name(path) only for
logging/display, ensuring no silent overwrites of other collections.
In `@tests/test_graph_commands.py`:
- Around line 664-688: The tests test_suggest_base_too_few_graphs and
test_suggest_base_invalid_min_collections are currently catching any Exception;
change both pytest.raises calls to specifically expect click.UsageError (e.g.,
pytest.raises(click.UsageError, match="...")) so the validation path is
asserted; also ensure the test module imports click (add "import click" at top
if missing) and keep the existing match strings unchanged to verify the error
message.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c3d964ed-e7b1-4950-a7d6-57e1d8d093f0
📒 Files selected for processing (2)
src/fromager/commands/graph.pytests/test_graph_commands.py
99b5601 to
ce90870
Compare
|
I was planning to work this up, but held off because #1030 hasn't reached consensus yet and wanted to avoid churn. Also, I was thinking the two commands share some helpers, so I was waiting to align on the approach first. Thanks for getting this done though -- glad it's moving forward! 😄 |
Analyzes multiple collection graph files to find packages that appear across >= N collections. These shared packages are candidates for a "base" collection built once and reused across parallel collection builds. Also computes collection impact analysis showing how the proposed base would affect each input collection: remaining package counts per collection and cross-collection counts for remaining packages, helping identify candidates for a secondary base. New public command: fromager graph suggest-base GRAPH [GRAPH ...] Options: --min-collections INT threshold (default: 50% of input graphs, rounded up) --base PATH mark packages already in an existing base graph --format table|json output format (default: table) Implementation: - _get_collection_packages(): load graph, return canonical names - _find_shared_packages(): find overlap, sort by count desc then name asc - _compute_collection_impact(): per-collection remaining package analysis - _suggest_base_table(): rich MARKDOWN table output with impact sections - _suggest_base_json(): structured JSON with metadata, candidates, and collection_impact key - _suggest_base_impl(): testable core extracted from the click command Tests: 13 new unit tests covering helpers, table/JSON output, --base flag, impact analysis, and error cases. Closes: python-wheel-build#973 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
- Remove unused @click.pass_obj / wkctx from suggest_base - Use clickext.ClickPath(exists=True) for --base and collection_graphs arguments so paths are validated early and returned as pathlib.Path - Use click.IntRange(min=2) for --min-collections, removing the manual lower-bound check from _suggest_base_impl - Add post-filter validation: raise UsageError when fewer than 2 non-empty graphs remain after skipping empty collections - Update type annotations throughout (_suggest_base_impl, _suggest_base_json, _get_collection_name, _get_collection_packages) - Simplify resolved-path key computation to str(path.resolve()) - Serialize base_graph as str in JSON output Tests: - _make_graph_file now returns pathlib.Path - Update base_graph assertion in test_suggest_base_with_base_graph - Add test_suggest_base_all_but_one_empty for post-filter validation Co-Authored-By: Claude <claude@anthropic.com> Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
Rename the `graph suggest-base` subcommand to `graph find-common-dependencies` for clarity. Updates all internal helpers and tests to match. Co-Authored-By: Claude <claude@anthropic.com> Signed-off-by: Doug Hellmann <dhellmann@redhat.com>
ce90870 to
b3ce857
Compare
|
@smoparth I renamed the command here |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/fromager/commands/graph.py`:
- Line 1146: The command function currently named find_common_dependencies
registers as "fromager graph find-common-dependencies" but should be "fromager
graph suggest-base"; rename the function to suggest_base (and any
decorator/registration reference that uses the function name) and update all
internal references/calls (imports, tests, help text, and any places that call
find_common_dependencies) to use suggest_base so the CLI command matches issue
`#973` and the PR title.
- Around line 1152-1157: Add a Sphinx `versionadded` directive to the docstring
for the command that begins "Suggest packages for a shared base collection" (the
command function/class in src/fromager/commands/graph.py). Edit that docstring
to include a line like “.. versionadded:: X.Y.Z” (replace X.Y.Z with the release
version) under the summary/description so the new user-facing command is
documented per guidelines. Ensure the directive uses correct indentation and is
placed before any parameter or examples sections.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e1feb29e-3138-4b16-800c-c85de826b14e
📒 Files selected for processing (2)
src/fromager/commands/graph.pytests/test_graph_commands.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_graph_commands.py
| @click.argument( | ||
| "collection_graphs", nargs=-1, required=True, type=clickext.ClickPath(exists=True) | ||
| ) | ||
| def find_common_dependencies( |
There was a problem hiding this comment.
Command name doesn't match requirements.
The function name find_common_dependencies will register the command as fromager graph find-common-dependencies, but the PR objectives (issue #973) and PR title specify the command should be fromager graph suggest-base.
📝 Rename command to match specification
-def find_common_dependencies(
+def suggest_base(
collection_graphs: tuple[pathlib.Path, ...],
base_graph: pathlib.Path | None,
min_collections: int | None,
output_format: str,
) -> None:Also update the calls in the implementation if the internal function names reference this.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/fromager/commands/graph.py` at line 1146, The command function currently
named find_common_dependencies registers as "fromager graph
find-common-dependencies" but should be "fromager graph suggest-base"; rename
the function to suggest_base (and any decorator/registration reference that uses
the function name) and update all internal references/calls (imports, tests,
help text, and any places that call find_common_dependencies) to use
suggest_base so the CLI command matches issue `#973` and the PR title.
| """Suggest packages for a shared base collection. | ||
|
|
||
| Analyzes COLLECTION_GRAPHS (2 or more graph files) to identify packages | ||
| appearing across multiple collections. These are candidates for factoring | ||
| into a base collection built once and reused. | ||
| """ |
There was a problem hiding this comment.
Add Sphinx versionadded directive to docstring.
User-facing commands should include Sphinx directives per coding guidelines. As per coding guidelines, use versionadded directive for this new command.
📚 Add version directive
"""Suggest packages for a shared base collection.
Analyzes COLLECTION_GRAPHS (2 or more graph files) to identify packages
appearing across multiple collections. These are candidates for factoring
into a base collection built once and reused.
+
+ .. versionadded:: <version>
"""Replace <version> with the appropriate version number for this release.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| """Suggest packages for a shared base collection. | |
| Analyzes COLLECTION_GRAPHS (2 or more graph files) to identify packages | |
| appearing across multiple collections. These are candidates for factoring | |
| into a base collection built once and reused. | |
| """ | |
| """Suggest packages for a shared base collection. | |
| Analyzes COLLECTION_GRAPHS (2 or more graph files) to identify packages | |
| appearing across multiple collections. These are candidates for factoring | |
| into a base collection built once and reused. | |
| .. versionadded:: <version> | |
| """ |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/fromager/commands/graph.py` around lines 1152 - 1157, Add a Sphinx
`versionadded` directive to the docstring for the command that begins "Suggest
packages for a shared base collection" (the command function/class in
src/fromager/commands/graph.py). Edit that docstring to include a line like “..
versionadded:: X.Y.Z” (replace X.Y.Z with the release version) under the
summary/description so the new user-facing command is documented per guidelines.
Ensure the directive uses correct indentation and is placed before any parameter
or examples sections.
Summary
graph suggest-basecommand to identify packages shared acrossmultiple collections as candidates for a base collection
after removing base candidates, and cross-collection counts to help
identify secondary base candidates
collection_impactsectionin both
Test Plan
hatch run test:test tests/test_graph_commands.pyhatch run lint:fix && hatch run mypy:check src/fromager/commands/graph.pyfromager graph suggest-base graphs/*.json 2>/dev/null | lessfromager graph suggest-base --format json graphs/*.json 2>/dev/null | python3 -m json.tool | lessCloses: #973
🤖 Generated with Claude Code